Project’s Objective

This project aims to replicate the findings of Lima and Delen (2020), published in Government Information Quarterly (ABDC: A, ABS: 3, Q1). The original dataset was not made publicly available, so I collected and reorganized the data from the respective sources. Some variables referenced in the paper are no longer accessible and could not be included in this replication. Consequently, the results of this project differ significantly from those reported in the original study.

As highlighted in the literature (Moody, Keister, and Ramos 2022), replicating social science research is often challenging due to factors like data unavailability. Despite these challenges, this project showcases critical data modeling techniques, including acquiring data from multiple sources, merging, manipulating, and transforming data, and applying machine learning methods.

Theoretical Contributions:

The theoretical contributions of the original paper are twofold. First, it employs multiple prediction models alongside a heuristic method to assess variable importance, based on the ratio of candidate splits to splits in the Random Forest’s statistical output. This approach gives a more nuanced view of variable importance. Second, it uses machine learning techniques to identify potential predictors of corruption at the country level, rather than at the more commonly analyzed regional level.

Quantitative Replication

library(tidyverse)
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr     1.1.4     ✔ readr     2.1.5
## ✔ forcats   1.0.0     ✔ stringr   1.5.1
## ✔ ggplot2   3.5.1     ✔ tibble    3.2.1
## ✔ lubridate 1.9.3     ✔ tidyr     1.3.1
## ✔ purrr     1.0.2     
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(haven)
library(readxl)
library(readr)
library(janitor)
## 
## Attaching package: 'janitor'
## 
## The following objects are masked from 'package:stats':
## 
##     chisq.test, fisher.test
library(wbstats)

df1<-read_csv(file.choose()) ## Ease of Doing Business data, used to extract the indicator codes
## New names:
## • `` -> `...22`
## Rows: 41322 Columns: 22
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr  (4): Country Name, Country Code, Indicator Name, Indicator Code
## dbl (17): 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010, 2011, 2012, 2013, ...
## lgl  (1): ...22
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
df1<-df1 %>% clean_names() ## Clean the variable names
indicators_codes<-unique(df1$indicator_code) ## Extract the indicator codes
DB_data<-wb_data(indicators_codes) ## Retrieve the indicator values from the World Bank's API
efi.df<-read_excel(file.choose()) ## Load the Economic Freedom Index (Heritage Foundation)
cpi.df<-read_excel(file.choose()) ## Load the Corruption Perceptions Index (Transparency International)
educ.df<-read_excel(file.choose()) ## Load the Education Index

The datasets come from different sources and do not share a uniform structure, so each one must be reshaped into the same country-year panel structure before they can be merged.
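As a minimal sketch of that target structure, the toy example below reshapes a wide table into country-year rows and joins it to a table that is already long. Both data frames, their values, and the column names `score` and `index` are made up for illustration; only the `iso3`/`year` key mirrors the code that follows.

```r
library(dplyr)
library(tidyr)

# Hypothetical wide table: one row per country, one column per year
wide_toy <- tibble(
  iso3  = c("AAA", "BBB"),
  x2018 = c(10, 20),
  x2019 = c(11, 21)
)

# Hypothetical table that is already in country-year (panel) form
long_toy <- tibble(
  iso3  = c("AAA", "AAA", "BBB"),
  year  = c(2018L, 2019L, 2018L),
  index = c(0.5, 0.6, 0.7)
)

# Reshape to one row per country-year and strip the "x" prefix from the year
wide_as_panel <- wide_toy %>%
  pivot_longer(cols = starts_with("x"), names_to = "year", values_to = "score") %>%
  mutate(year = as.integer(sub("^x", "", year)))

# An inner join keeps only the country-years present in both tables
panel_toy <- inner_join(wide_as_panel, long_toy, by = c("iso3", "year"))
```

Converting `year` to the same type in every table before joining avoids key mismatches between character and integer years.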

#Data Preparation
## Start with the dataset that needs the most reshaping: cpi.df
cpi.df<-cpi.df %>% clean_names() # Clean the names first to simplify the next steps
cpi_long <- cpi.df %>%
  pivot_longer(
    cols = starts_with("x"),        # Select all columns starting with "x"
    names_to = "year",              # New column name for years
    values_to = "value"             # New column name for values
  ) %>%
  mutate(
    year = as.integer(gsub("x", "", year))  # Remove "x" and convert to integer
  ) %>%
  select(economy_iso3, economy_name, year, value)  # Keep only the identifiers, the year, and the CPI value
### Rename columns in cpi_long for standardization purposes
cpi_long <- cpi_long %>%
  rename(
    iso3 = economy_iso3,
    country = economy_name,
    cpi = value
  )
## efi.df is already organized in a panel structure, but it lacks country codes, which are needed to merge the datasets, so they must be added.
## The variable names must be cleaned first
efi.df<-efi.df %>% clean_names()
library(countrycode)
efi.df$iso3<-NULL
efi.df <- efi.df %>%
  mutate(iso3 = countrycode(country, "country.name", "iso3c"))
efi.df <- efi.df %>%
  mutate(iso3 = ifelse(country == "Kosovo", "XKX", iso3),  # Assign XKX to Kosovo
         iso3 = ifelse(country == "Micronesia", "FSM", iso3))  # Assign FSM to Micronesia
## Education index dataset must be converted into long format 
educ_long <- educ.df %>%
  pivot_longer(
    cols = -country,   # Specify that all columns except 'country' should be gathered
    names_to = "year", # The name of the new column that will hold the year values
    values_to = "educ_index" # The name of the new column that will hold the corresponding values
  )
### Country codes (iso3c) are missing, so add them
educ_long <- educ_long %>%
  mutate(iso3 = countrycode(country, "country.name", "iso3c"))
educ_long <- educ_long %>%
  mutate(iso3 = ifelse(country == "Kosovo", "XKX", iso3),        # Assign XKX to Kosovo
         iso3 = ifelse(country == "Micronesia", "FSM", iso3),    # Assign FSM to Micronesia
         iso3 = ifelse(country == "Chili", "CHL", iso3),         # Assign CHL to "Chili" (Chile)
         iso3 = ifelse(country == "Monte Negro", "MNE", iso3))   # Assign MNE to "Monte Negro" (Montenegro)
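
The manual `ifelse()` overrides above could also be handed to `countrycode()` directly through its `custom_match` argument. The sketch below is one possible alternative, not the approach used in this project; the vector `toy_names` is made up, and the overrides simply repeat the ones above.

```r
library(countrycode)

# Hypothetical country names, including spellings that need a manual override
toy_names <- c("France", "Kosovo", "Chili", "Monte Negro")

# Named vector: source spelling -> ISO3 code (same overrides as above)
manual_codes <- c(
  "Kosovo"      = "XKX",
  "Micronesia"  = "FSM",
  "Chili"       = "CHL",
  "Monte Negro" = "MNE"
)

# custom_match takes precedence over the package's default matching
toy_iso3 <- countrycode(
  toy_names,
  origin       = "country.name",
  destination  = "iso3c",
  custom_match = manual_codes
)
```

Keeping the overrides in one named vector means the same corrections can be reused for every dataset that needs country codes.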

## The three datasets are now nearly ready for merging, but only some of the variables in DB_data are needed, so those are selected first
variables<-c("IC.REG.STRT.BUS.DFRN","IC.REG.COST.PC.MA.ZS","IC.REG.PROC.MA.NO",
             "IC.REG.DURS.MA.DY","IC.REG.DURS.FE.DY","IC.REG.PROC.FE.NO",
             "IC.REG.COST.PC.FE.ZS","IC.REG.MIN.CAP","IC.CNST.PRMT.DFRN.DB1619",
             "IC.CNST.PRMT.DFRN.DB0615","IC.CNST.PRMT.PROC.NO","IC.CNST.PRMT.TM.DY",
             "IC.CNST.PRMT.COST.WRH.VAL","IC.DCP.BQC.XD.015.DB1619","IC.CNST.PRMT.QBR.XD.02.DB1619",
             "IC.CNST.PRMT.QCBC.XD.01.DB1619","IC.CNST.PRMT.QCDC.XD.03.DB1619",
             "IC.CNST.PRMT.QCAC.XD.DB1619","IC.CNST.LIR.XD.02.DB1619","IC.CNST.PC.XD.04.DB1619",
             "IC.ELC.ACES.DFRN.DB1015","IC.ELC.ACES.DFRN.DB1619","IC.ELC.PROC.NO","IC.ELC.TIME",
             "IC.ELC.ACS.COST","IC.ELC.RSTT.XD.08.DB1619","IC.ELC.OUTG.FREQ.DURS.03.DB1619",
             "IC.ELC.MONT.OUTG.01.DB1619","IC.ELC.RSTOR.01.DB1619","IC.ELC.REGU.MONT.01.DB1619",
             "IC.ELC.LMTG.OUTG.01.DB1619","IC.ELC.COMM.TRFF.CG.01.DB1619",
             "IC.REG.PRRT.DFRN.DB0515","IC.REG.PRRT.DFRN.DB1719","IC.REG.PRRT.COST.PRT.VAL",
             "IC.REG.PRRT.DURS.TM","IC.REG.PRRT.PROC.NO","IC.REG.PRRT.QUAL.LNDADM.XD.030.DB16",
             "IC.REG.PRRT.RELI.INFR.XD.09.DB1619","IC.REG.PRRT.TRAP.INFO.XD.06.DB1619",
             "IC.REG.PRRT.GEO.COVR.XD.08.DB1619","IC.REG.PRRT.LAND.DISP.XD.08.DB1619",
             "IC.REG.PRRT.EQACCS.XD.08.DB1619","IC.CRED.ACC.CRD.DB0514.DFRN",
             "IC.CRED.ACC.CRD.DB1519.DFRN","IC.CRED.ACC.LGL.RGHT.XD.012.DB1519",
             "IC.CRED.ACC.DPTH.CISI.XD.08.DB1519","IC.CRED.ACC.PUBL.CRD.REG.COVR.ZS",
             "IC.CRED.ACC.PRVT.CRD.ZS","IC.CRED.ACC.ACES.DB1519","IC.CRED.ACC.ACES.DB0514",
             "PROT.MINOR.INV.DFRN.DB1519","PROT.MINOR.INV.DFRN.DB0614",
             "PROT.MINOR.INV.EXT.BUS.DISC.010.XD","PROT.MINOR.INV.IC.PRIN.EXT.DIR.LGL.010.XD",
             "PROT.MINOR.INV.EASE.SHARE.LGL.XD.010.DB1519",
             "PROT.MINOR.INV.EASE.SHARE.LGL.XD.010.DB0614",
             "PROT.MINOR.INV.EXT.SHARE.RTS.XD.010.DB1519",
             "PROT.MINOR.INV.EXT.OWNR.CONT.XD.0100.DB1519",
             "PROT.MINOR.INV.EXT.CORP.TRANP.XD.0010.DB1519",
             "PROT.MINOR.INV.STRENG.MIN.INV.PROT.XD.010.DB0614",
             "PAY.TAX.DB1719.DRFN","PAY.TAX.DB0616.DFRN","PAY.TAX.PYMT.FREQ.NO","PAY.TAX.TM",
             "PAY.TAX.TOT.TAX.RT.ZS","PAY.TAX.PRFT.CP.ZS","PAY.TAX.LABR.TAX.CONTR.ZS",
             "OTHR.TAX.PAID.ZS","PAY.TAX.COIT.AU.HRS.DB1719","PAY.TAX.COIT.AU.WKS.DB1719",
             "PAY.TAX.POST.FIL.XD.0100.DB1719.DFRN","TRD.ACRS.BRDR.DB1619.DFRN",
             "TRD.ACRS.BRDR.DB0615.DFRN","TRD.ACRS.BRDR.EXPT.TM.DOC.COMP.HR.DB1619.DFRN",
             "TRD.ACRS.BRDR.IMP.TM.DOC.COMP.HR.DB1619.DFRN",
             "TRD.ACRS.BRDR.EXPT.TM.BRDR.COMP.HR.DB1619.DFRN",
             "TRD.ACRS.BRDR.IMP.TM.BRDR.COMP.HR.DB1619.DFRN",
             "TRD.ACRS.BRDR.EXPT.COST.DOC.COMP.CD.DB1619",
             "TRD.ACRS.BRDR.IMP.COST.BRDR.COMP.CD.DB1619",
             "ENF.CONT.COEN.DB0415.DFRN","ENF.CONT.COEN.DB1719.DFRN",
             "ENF.CONT.DURS.DY","ENF.CONT.COEN.FLSR.DY","ENF.CONT.COEN.TRJU.DY",
             "ENF.CONT.COEN.ENJU.DY","ENF.CONT.COEN.COST.ZS","ENF.CONT.COEN.ATFE.PR",
             "ENF.CONT.COEN.CTFE.PR","ENF.CONT.COEN.ENFE.PR","ENF.CONT.COEN.QUJP.XD",
             "ENF.CONT.COEN.CTSP.DB1719","ENF.CONT.COEN.CSMG","ENF.CONT.COEN.CTAU",
             "ENF.CONT.COEN.ATDR","RESLV.ISV.DB1519.DFRN","RESLV.ISV.RCOV.RT",
             "RESLV.ISV.SOIF.06.DB1519","RESLV.ISV.COPR.03.XD.DB1519","RESLV.ISV.MGDA.XD.DB1519",
             "RESLV.ISV.ROPC.03.XD.DB1519","RESLV.ISV.CPI.04.XD.DB1519",
             "IC.BUS.EASE.DFRN.XQ.DB1719","IC.BUS.EASE.DFRN.DB16","IC.BUS.EASE.DFRN.DB1014"
             )
data.db<-DB_data %>% select(iso3c,country,date,all_of(variables)) #selecting the variables needed
## Combine the methodology-specific score series into one column per topic
data.db <- data.db %>%
  mutate(
    dealing_w_construct  = coalesce(IC.CNST.PRMT.DFRN.DB0615, IC.CNST.PRMT.DFRN.DB1619),
    getting_electricity  = coalesce(IC.ELC.ACES.DFRN.DB1015, IC.ELC.ACES.DFRN.DB1619),
    registering_property = coalesce(IC.REG.PRRT.DFRN.DB0515, IC.REG.PRRT.DFRN.DB1719),
    getting_credit       = coalesce(IC.CRED.ACC.CRD.DB0514.DFRN, IC.CRED.ACC.CRD.DB1519.DFRN),
    protecting_minority  = coalesce(PROT.MINOR.INV.DFRN.DB0614, PROT.MINOR.INV.DFRN.DB1519),
    paying_taxes         = coalesce(PAY.TAX.DB0616.DFRN, PAY.TAX.DB1719.DRFN),
    trading_borders      = coalesce(TRD.ACRS.BRDR.DB0615.DFRN, TRD.ACRS.BRDR.DB1619.DFRN),
    enforcing_contracts  = coalesce(ENF.CONT.COEN.DB0415.DFRN, ENF.CONT.COEN.DB1719.DFRN),
    overall_score_db     = coalesce(IC.BUS.EASE.DFRN.DB1014,
                                    IC.BUS.EASE.DFRN.DB16,
                                    IC.BUS.EASE.DFRN.XQ.DB1719)
  )
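
Each `coalesce()` call above fills a single score column from whichever methodology-specific series is non-missing in a given row, taking the first non-missing value in argument order. A minimal sketch with made-up values (the column names `score_old` and `score_new` are hypothetical):

```r
library(dplyr)

toy_scores <- tibble(
  year      = c(2014L, 2015L, 2016L),
  score_old = c(60, 62, NA),   # series reported under the older methodology
  score_new = c(NA, 65, 66)    # series reported under the newer methodology
)

# coalesce() returns, row by row, the first non-missing value among its arguments
toy_scores <- toy_scores %>%
  mutate(score = coalesce(score_old, score_new))
```

Where the two series overlap (2015 in this toy data), the first argument wins, so the argument order decides which methodology takes precedence.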
data.db2<-data.db ## Work on a copy; data.db is kept intact in case it is needed later
## The source columns can be eliminated now that the combined ones exist
# Vector of variables to be removed
vars_to_remove <- c(
  "IC.CNST.PRMT.DFRN.DB0615", 
  "IC.CNST.PRMT.DFRN.DB1619", 
  "IC.ELC.ACES.DFRN.DB1015", 
  "IC.ELC.ACES.DFRN.DB1619",
  "IC.REG.PRRT.DFRN.DB0515",
  "IC.REG.PRRT.DFRN.DB1719",
  "IC.CRED.ACC.CRD.DB0514.DFRN",
  "IC.CRED.ACC.CRD.DB1519.DFRN",
  "PROT.MINOR.INV.DFRN.DB0614",
  "PROT.MINOR.INV.DFRN.DB1519",
  "PAY.TAX.DB0616.DFRN",
  "PAY.TAX.DB1719.DRFN",
  "TRD.ACRS.BRDR.DB0615.DFRN",
  "TRD.ACRS.BRDR.DB1619.DFRN",
  "ENF.CONT.COEN.DB0415.DFRN",
  "ENF.CONT.COEN.DB1719.DFRN",
  "IC.BUS.EASE.DFRN.DB1014",
  "IC.BUS.EASE.DFRN.DB16",
  "IC.BUS.EASE.DFRN.XQ.DB1719"
)

### Remove the variables from the dataset using the vector, then standardize the key names
data.db2 <- data.db2 %>%
  select(-all_of(vars_to_remove)) %>%
  rename(year = date, iso3 = iso3c)
research_data<-merge(data.db2,efi.df, by = c("iso3","year"))
research_data<-merge(research_data,cpi_long,by = c("iso3","year"))
research_data<-merge(research_data,educ_long[,-1],by = c("iso3","year"))
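
By default `merge()` performs an inner join, so any country-year that is missing from one of the tables is dropped without a message. One way to see what is lost, sketched here on made-up data rather than on the project's tables, is to run `anti_join()` in both directions before merging:

```r
library(dplyr)

# Hypothetical tables sharing the iso3/year key
left_toy  <- tibble(iso3 = c("AAA", "BBB", "CCC"), year = 2019L, x = 1:3)
right_toy <- tibble(iso3 = c("AAA", "CCC", "DDD"), year = 2019L, y = 4:6)

# Rows of the left table with no partner in the right table (lost in an inner merge)
dropped_from_left <- anti_join(left_toy, right_toy, by = c("iso3", "year"))

# Rows of the right table with no partner in the left table
dropped_from_right <- anti_join(right_toy, left_toy, by = c("iso3", "year"))

# The inner merge itself keeps only the matching keys
merged_toy <- merge(left_toy, right_toy, by = c("iso3", "year"))
```

Inspecting the two `dropped_from_*` tables makes it easier to tell genuine coverage gaps apart from country-code mismatches.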
# Recode the "N/A" and "NaN" strings as NA first, so that the numeric conversion below does not have to coerce them
research_data[research_data == "N/A"] <- NA
research_data[research_data == "NaN"] <- NA
# Convert all columns to numeric except the identifier columns
id_cols <- c("country.x", "country.y", "iso3", "country")
num_cols <- !(names(research_data) %in% id_cols)
research_data[num_cols] <- lapply(research_data[num_cols], as.numeric)
## Impute the missing data. The paper reports multivariate normal imputation; missRanger (chained random forests) is used here instead
library(missRanger)
imputed_data<-missRanger(research_data,verbose=1)
## 
## Variables to impute:     government_integrity, business_freedom, labor_freedom, monetary_freedom, property_rights, government_spending, investment_freedom, cpi, trade_freedom, tax_burden, financial_freedom, overall_score, IC.REG.MIN.CAP, PROT.MINOR.INV.EXT.BUS.DISC.010.XD, PROT.MINOR.INV.IC.PRIN.EXT.DIR.LGL.010.XD, RESLV.ISV.SOIF.06.DB1519, RESLV.ISV.COPR.03.XD.DB1519, RESLV.ISV.MGDA.XD.DB1519, RESLV.ISV.ROPC.03.XD.DB1519, RESLV.ISV.CPI.04.XD.DB1519, protecting_minority, IC.REG.STRT.BUS.DFRN, IC.REG.PROC.MA.NO, IC.REG.DURS.MA.DY, IC.REG.DURS.FE.DY, IC.REG.PROC.FE.NO, IC.CRED.ACC.PUBL.CRD.REG.COVR.ZS, IC.CRED.ACC.PRVT.CRD.ZS, ENF.CONT.DURS.DY, ENF.CONT.COEN.FLSR.DY, ENF.CONT.COEN.TRJU.DY, ENF.CONT.COEN.ENJU.DY, ENF.CONT.COEN.COST.ZS, ENF.CONT.COEN.ATFE.PR, ENF.CONT.COEN.CTFE.PR, ENF.CONT.COEN.ENFE.PR, RESLV.ISV.DB1519.DFRN, RESLV.ISV.RCOV.RT, dealing_w_construct, getting_electricity, registering_property, getting_credit, paying_taxes, trading_borders, enforcing_contracts, overall_score_db, PAY.TAX.PYMT.FREQ.NO, PAY.TAX.TM, PAY.TAX.TOT.TAX.RT.ZS, PAY.TAX.PRFT.CP.ZS, PAY.TAX.LABR.TAX.CONTR.ZS, OTHR.TAX.PAID.ZS, IC.ELC.PROC.NO, IC.ELC.ACS.COST, IC.REG.PRRT.COST.PRT.VAL, IC.REG.PRRT.DURS.TM, IC.REG.PRRT.PROC.NO, IC.CNST.PRMT.PROC.NO, IC.CNST.PRMT.TM.DY, IC.CNST.PRMT.COST.WRH.VAL, IC.REG.COST.PC.MA.ZS, IC.REG.COST.PC.FE.ZS, PROT.MINOR.INV.EASE.SHARE.LGL.XD.010.DB1519, PROT.MINOR.INV.EXT.SHARE.RTS.XD.010.DB1519, PROT.MINOR.INV.EXT.OWNR.CONT.XD.0100.DB1519, PROT.MINOR.INV.EXT.CORP.TRANP.XD.0010.DB1519, PROT.MINOR.INV.STRENG.MIN.INV.PROT.XD.010.DB0614, IC.CRED.ACC.LGL.RGHT.XD.012.DB1519, IC.CRED.ACC.DPTH.CISI.XD.08.DB1519, IC.CRED.ACC.ACES.DB1519, IC.ELC.TIME, TRD.ACRS.BRDR.EXPT.TM.DOC.COMP.HR.DB1619.DFRN, TRD.ACRS.BRDR.IMP.TM.DOC.COMP.HR.DB1619.DFRN, TRD.ACRS.BRDR.EXPT.TM.BRDR.COMP.HR.DB1619.DFRN, TRD.ACRS.BRDR.IMP.TM.BRDR.COMP.HR.DB1619.DFRN, TRD.ACRS.BRDR.EXPT.COST.DOC.COMP.CD.DB1619, TRD.ACRS.BRDR.IMP.COST.BRDR.COMP.CD.DB1619, IC.ELC.RSTT.XD.08.DB1619, 
IC.ELC.OUTG.FREQ.DURS.03.DB1619, IC.ELC.MONT.OUTG.01.DB1619, IC.ELC.RSTOR.01.DB1619, IC.ELC.REGU.MONT.01.DB1619, IC.ELC.LMTG.OUTG.01.DB1619, IC.ELC.COMM.TRFF.CG.01.DB1619, IC.DCP.BQC.XD.015.DB1619, IC.CNST.PRMT.QBR.XD.02.DB1619, IC.CNST.PRMT.QCBC.XD.01.DB1619, IC.CNST.PRMT.QCDC.XD.03.DB1619, IC.CNST.PRMT.QCAC.XD.DB1619, IC.CNST.LIR.XD.02.DB1619, IC.CNST.PC.XD.04.DB1619, judicial_effectiveness, fiscal_health, ENF.CONT.COEN.QUJP.XD, ENF.CONT.COEN.CTSP.DB1719, ENF.CONT.COEN.CSMG, ENF.CONT.COEN.CTAU, ENF.CONT.COEN.ATDR, IC.REG.PRRT.QUAL.LNDADM.XD.030.DB16, IC.REG.PRRT.RELI.INFR.XD.09.DB1619, IC.REG.PRRT.TRAP.INFO.XD.06.DB1619, IC.REG.PRRT.GEO.COVR.XD.08.DB1619, IC.REG.PRRT.LAND.DISP.XD.08.DB1619, IC.REG.PRRT.EQACCS.XD.08.DB1619, PAY.TAX.POST.FIL.XD.0100.DB1719.DFRN, PAY.TAX.COIT.AU.HRS.DB1719, PAY.TAX.COIT.AU.WKS.DB1719, PROT.MINOR.INV.EASE.SHARE.LGL.XD.010.DB0614, IC.CRED.ACC.ACES.DB0514
## Variables used to impute:    iso3, year, country.x, IC.REG.STRT.BUS.DFRN, IC.REG.COST.PC.MA.ZS, IC.REG.PROC.MA.NO, IC.REG.DURS.MA.DY, IC.REG.DURS.FE.DY, IC.REG.PROC.FE.NO, IC.REG.COST.PC.FE.ZS, IC.REG.MIN.CAP, IC.CNST.PRMT.PROC.NO, IC.CNST.PRMT.TM.DY, IC.CNST.PRMT.COST.WRH.VAL, IC.DCP.BQC.XD.015.DB1619, IC.CNST.PRMT.QBR.XD.02.DB1619, IC.CNST.PRMT.QCBC.XD.01.DB1619, IC.CNST.PRMT.QCDC.XD.03.DB1619, IC.CNST.PRMT.QCAC.XD.DB1619, IC.CNST.LIR.XD.02.DB1619, IC.CNST.PC.XD.04.DB1619, IC.ELC.PROC.NO, IC.ELC.TIME, IC.ELC.ACS.COST, IC.ELC.RSTT.XD.08.DB1619, IC.ELC.OUTG.FREQ.DURS.03.DB1619, IC.ELC.MONT.OUTG.01.DB1619, IC.ELC.RSTOR.01.DB1619, IC.ELC.REGU.MONT.01.DB1619, IC.ELC.LMTG.OUTG.01.DB1619, IC.ELC.COMM.TRFF.CG.01.DB1619, IC.REG.PRRT.COST.PRT.VAL, IC.REG.PRRT.DURS.TM, IC.REG.PRRT.PROC.NO, IC.REG.PRRT.QUAL.LNDADM.XD.030.DB16, IC.REG.PRRT.RELI.INFR.XD.09.DB1619, IC.REG.PRRT.TRAP.INFO.XD.06.DB1619, IC.REG.PRRT.GEO.COVR.XD.08.DB1619, IC.REG.PRRT.LAND.DISP.XD.08.DB1619, IC.REG.PRRT.EQACCS.XD.08.DB1619, IC.CRED.ACC.LGL.RGHT.XD.012.DB1519, IC.CRED.ACC.DPTH.CISI.XD.08.DB1519, IC.CRED.ACC.PUBL.CRD.REG.COVR.ZS, IC.CRED.ACC.PRVT.CRD.ZS, IC.CRED.ACC.ACES.DB1519, IC.CRED.ACC.ACES.DB0514, PROT.MINOR.INV.EXT.BUS.DISC.010.XD, PROT.MINOR.INV.IC.PRIN.EXT.DIR.LGL.010.XD, PROT.MINOR.INV.EASE.SHARE.LGL.XD.010.DB1519, PROT.MINOR.INV.EASE.SHARE.LGL.XD.010.DB0614, PROT.MINOR.INV.EXT.SHARE.RTS.XD.010.DB1519, PROT.MINOR.INV.EXT.OWNR.CONT.XD.0100.DB1519, PROT.MINOR.INV.EXT.CORP.TRANP.XD.0010.DB1519, PROT.MINOR.INV.STRENG.MIN.INV.PROT.XD.010.DB0614, PAY.TAX.PYMT.FREQ.NO, PAY.TAX.TM, PAY.TAX.TOT.TAX.RT.ZS, PAY.TAX.PRFT.CP.ZS, PAY.TAX.LABR.TAX.CONTR.ZS, OTHR.TAX.PAID.ZS, PAY.TAX.COIT.AU.HRS.DB1719, PAY.TAX.COIT.AU.WKS.DB1719, PAY.TAX.POST.FIL.XD.0100.DB1719.DFRN, TRD.ACRS.BRDR.EXPT.TM.DOC.COMP.HR.DB1619.DFRN, TRD.ACRS.BRDR.IMP.TM.DOC.COMP.HR.DB1619.DFRN, TRD.ACRS.BRDR.EXPT.TM.BRDR.COMP.HR.DB1619.DFRN, TRD.ACRS.BRDR.IMP.TM.BRDR.COMP.HR.DB1619.DFRN, TRD.ACRS.BRDR.EXPT.COST.DOC.COMP.CD.DB1619, 
TRD.ACRS.BRDR.IMP.COST.BRDR.COMP.CD.DB1619, ENF.CONT.DURS.DY, ENF.CONT.COEN.FLSR.DY, ENF.CONT.COEN.TRJU.DY, ENF.CONT.COEN.ENJU.DY, ENF.CONT.COEN.COST.ZS, ENF.CONT.COEN.ATFE.PR, ENF.CONT.COEN.CTFE.PR, ENF.CONT.COEN.ENFE.PR, ENF.CONT.COEN.QUJP.XD, ENF.CONT.COEN.CTSP.DB1719, ENF.CONT.COEN.CSMG, ENF.CONT.COEN.CTAU, ENF.CONT.COEN.ATDR, RESLV.ISV.DB1519.DFRN, RESLV.ISV.RCOV.RT, RESLV.ISV.SOIF.06.DB1519, RESLV.ISV.COPR.03.XD.DB1519, RESLV.ISV.MGDA.XD.DB1519, RESLV.ISV.ROPC.03.XD.DB1519, RESLV.ISV.CPI.04.XD.DB1519, dealing_w_construct, getting_electricity, registering_property, getting_credit, protecting_minority, paying_taxes, trading_borders, enforcing_contracts, overall_score_db, country.y, overall_score, property_rights, government_integrity, judicial_effectiveness, tax_burden, government_spending, fiscal_health, business_freedom, labor_freedom, monetary_freedom, trade_freedom, investment_freedom, financial_freedom, country, cpi, educ_index
## 
## iter 1 
##   |======================================================================| 100%
## iter 2 
## iter 3 

Descriptive Statistics:

# Descriptive Statistics
library(modelsummary)
## `modelsummary` 2.0.0 now uses `tinytable` as its default table-drawing
##   backend. Learn more at: https://vincentarelbundock.github.io/tinytable/
## 
## Revert to `kableExtra` for one session:
## 
##   options(modelsummary_factory_default = 'kableExtra')
##   options(modelsummary_factory_latex = 'kableExtra')
##   options(modelsummary_factory_html = 'kableExtra')
## 
## Silence this message forever:
## 
##   config_modelsummary(startup_message = FALSE)
desc_stats <- datasummary_skim(imputed_data[,-c(1:3,113)])
desc_stats
Variable Unique Missing Pct. Mean SD Min Median Max
IC.REG.STRT.BUS.DFRN 2100 0 80.2 13.2 15.5 83.2 100.0
IC.REG.COST.PC.MA.ZS 1313 0 25.9 38.3 0.0 12.9 393.0
IC.REG.PROC.MA.NO 635 0 7.6 2.9 1.0 7.2 20.0
IC.REG.DURS.MA.DY 771 0 23.4 28.9 0.5 15.0 690.0
IC.REG.DURS.FE.DY 774 0 23.5 28.9 0.5 15.4 690.0
IC.REG.PROC.FE.NO 635 0 7.7 3.0 1.0 7.5 20.0
IC.REG.COST.PC.FE.ZS 1314 0 26.1 38.6 0.0 12.9 393.0
IC.REG.MIN.CAP 931 0 35.1 269.8 0.0 0.3 7445.4
IC.CNST.PRMT.PROC.NO 683 0 15.3 4.1 7.0 15.0 44.0
IC.CNST.PRMT.TM.DY 918 0 171.3 78.5 27.5 161.0 677.0
IC.CNST.PRMT.COST.WRH.VAL 906 0 6.5 9.2 0.0 3.3 79.1
IC.DCP.BQC.XD.015.DB1619 1033 0 10.3 2.6 1.0 11.0 15.0
IC.CNST.PRMT.QBR.XD.02.DB1619 923 0 1.6 0.5 0.0 1.9 2.0
IC.CNST.PRMT.QCBC.XD.01.DB1619 552 0 0.9 0.3 0.0 1.0 1.0
IC.CNST.PRMT.QCDC.XD.03.DB1619 967 0 1.7 0.7 0.0 2.0 3.0
IC.CNST.PRMT.QCAC.XD.DB1619 810 0 2.7 0.6 0.0 3.0 3.0
IC.CNST.LIR.XD.02.DB1619 1001 0 0.8 0.6 0.0 0.9 2.0
IC.CNST.PC.XD.04.DB1619 999 0 2.6 1.3 0.0 2.8 4.0
IC.ELC.PROC.NO 638 0 5.2 1.3 2.0 5.0 10.0
IC.ELC.TIME 965 0 94.5 56.1 7.0 82.2 482.0
IC.ELC.ACS.COST 1967 0 1381.8 2645.1 0.0 421.7 34090.5
IC.ELC.RSTT.XD.08.DB1619 989 0 4.0 3.0 0.0 5.0 8.0
IC.ELC.OUTG.FREQ.DURS.03.DB1619 895 0 1.2 1.2 0.0 1.0 3.0
IC.ELC.MONT.OUTG.01.DB1619 647 0 0.7 0.4 0.0 1.0 1.0
IC.ELC.RSTOR.01.DB1619 677 0 0.7 0.4 0.0 1.0 1.0
IC.ELC.REGU.MONT.01.DB1619 757 0 0.8 0.4 0.0 1.0 1.0
IC.ELC.LMTG.OUTG.01.DB1619 949 0 0.5 0.4 0.0 0.4 1.0
IC.ELC.COMM.TRFF.CG.01.DB1619 796 0 0.8 0.4 0.0 1.0 1.0
IC.REG.PRRT.COST.PRT.VAL 777 0 5.6 3.8 0.0 5.0 28.0
IC.REG.PRRT.DURS.TM 792 0 46.7 48.9 1.0 36.5 319.0
IC.REG.PRRT.PROC.NO 641 0 6.0 2.0 1.0 6.0 14.0
IC.REG.PRRT.QUAL.LNDADM.XD.030.DB16 1243 0 14.3 7.0 2.5 14.0 28.5
IC.REG.PRRT.RELI.INFR.XD.09.DB1619 1184 0 4.0 2.7 0.0 4.1 8.0
IC.REG.PRRT.TRAP.INFO.XD.06.DB1619 1196 0 2.8 1.2 0.0 3.0 6.0
IC.REG.PRRT.GEO.COVR.XD.08.DB1619 1096 0 2.6 2.8 0.0 1.4 8.0
IC.REG.PRRT.LAND.DISP.XD.08.DB1619 1185 0 5.0 1.3 0.5 5.0 8.0
IC.REG.PRRT.EQACCS.XD.08.DB1619 514 0 -0.1 0.2 -1.0 0.0 0.0
IC.CRED.ACC.LGL.RGHT.XD.012.DB1519 786 0 5.0 2.7 0.0 5.0 12.0
IC.CRED.ACC.DPTH.CISI.XD.08.DB1519 782 0 5.1 2.9 0.0 6.4 8.0
IC.CRED.ACC.PUBL.CRD.REG.COVR.ZS 986 0 12.1 20.8 0.0 2.4 100.0
IC.CRED.ACC.PRVT.CRD.ZS 1060 0 33.2 35.5 0.0 19.8 100.0
IC.CRED.ACC.ACES.DB1519 794 0 10.1 4.3 0.0 10.0 20.0
IC.CRED.ACC.ACES.DB0514 1779 0 9.2 3.0 1.0 9.2 16.0
PROT.MINOR.INV.EXT.BUS.DISC.010.XD 590 0 5.8 2.2 0.0 6.0 10.0
PROT.MINOR.INV.IC.PRIN.EXT.DIR.LGL.010.XD 589 0 4.6 2.3 0.0 4.8 10.0
PROT.MINOR.INV.EASE.SHARE.LGL.XD.010.DB1519 779 0 6.0 1.8 0.0 6.0 10.0
PROT.MINOR.INV.EASE.SHARE.LGL.XD.010.DB0614 1750 0 5.6 1.6 0.0 6.0 10.0
PROT.MINOR.INV.EXT.SHARE.RTS.XD.010.DB1519 776 0 3.4 1.8 0.0 4.0 6.0
PROT.MINOR.INV.EXT.OWNR.CONT.XD.0100.DB1519 778 0 3.1 2.0 0.0 3.1 7.0
PROT.MINOR.INV.EXT.CORP.TRANP.XD.0010.DB1519 778 0 3.5 2.2 0.0 4.0 7.0
PROT.MINOR.INV.STRENG.MIN.INV.PROT.XD.010.DB0614 815 0 26.3 8.6 0.0 28.0 46.0
PAY.TAX.PYMT.FREQ.NO 680 0 25.5 15.9 3.0 23.8 99.0
PAY.TAX.TM 953 0 269.3 243.8 12.0 229.0 2600.0
PAY.TAX.TOT.TAX.RT.ZS 1055 0 41.3 20.3 7.4 38.1 339.1
PAY.TAX.PRFT.CP.ZS 883 0 16.2 7.9 -0.2 17.1 58.9
PAY.TAX.LABR.TAX.CONTR.ZS 903 0 17.4 10.2 0.0 16.2 54.0
OTHR.TAX.PAID.ZS 793 0 7.1 18.7 0.0 2.3 272.3
PAY.TAX.COIT.AU.HRS.DB1719 1269 0 15.6 18.0 1.0 9.2 207.5
PAY.TAX.COIT.AU.WKS.DB1719 1264 0 11.9 17.1 0.0 4.4 113.3
PAY.TAX.POST.FIL.XD.0100.DB1719.DFRN 1378 0 57.3 25.1 0.0 56.1 100.0
TRD.ACRS.BRDR.EXPT.TM.DOC.COMP.HR.DB1619.DFRN 1085 0 71.8 27.1 0.0 75.8 100.0
TRD.ACRS.BRDR.IMP.TM.DOC.COMP.HR.DB1619.DFRN 1101 0 72.6 25.8 0.0 77.7 100.0
TRD.ACRS.BRDR.EXPT.TM.BRDR.COMP.HR.DB1619.DFRN 1134 0 64.2 26.1 0.0 66.0 100.0
TRD.ACRS.BRDR.IMP.TM.BRDR.COMP.HR.DB1619.DFRN 1139 0 71.1 24.8 0.0 74.6 100.0
TRD.ACRS.BRDR.EXPT.COST.DOC.COMP.CD.DB1619 1095 0 128.2 133.8 0.0 100.0 1800.0
TRD.ACRS.BRDR.IMP.COST.BRDR.COMP.CD.DB1619 1173 0 473.8 341.7 0.0 450.0 3039.0
ENF.CONT.DURS.DY 822 0 645.9 286.8 164.0 575.0 1785.0
ENF.CONT.COEN.FLSR.DY 666 0 40.1 23.2 6.0 34.6 200.0
ENF.CONT.COEN.TRJU.DY 736 0 412.8 225.9 90.0 365.0 1420.0
ENF.CONT.COEN.ENJU.DY 680 0 189.0 99.0 30.0 178.2 600.0
ENF.CONT.COEN.COST.ZS 788 0 32.9 18.5 0.1 27.9 163.2
ENF.CONT.COEN.ATFE.PR 715 0 20.5 15.0 0.0 17.3 155.7
ENF.CONT.COEN.CTFE.PR 698 0 6.3 4.3 0.1 5.3 40.2
ENF.CONT.COEN.ENFE.PR 696 0 5.6 5.0 0.0 5.0 38.3
ENF.CONT.COEN.QUJP.XD 1202 0 8.3 2.9 1.5 7.6 16.5
ENF.CONT.COEN.CTSP.DB1719 1173 0 3.3 0.9 0.0 3.3 5.0
ENF.CONT.COEN.CSMG 1178 0 1.8 1.3 0.0 1.5 5.5
ENF.CONT.COEN.CTAU 1148 0 0.9 1.0 0.0 0.5 4.0
ENF.CONT.COEN.ATDR 1156 0 2.3 0.4 0.0 2.3 3.0
RESLV.ISV.DB1519.DFRN 1757 0 46.3 22.1 0.0 42.9 93.9
RESLV.ISV.RCOV.RT 1160 0 37.9 24.1 0.0 32.6 93.1
RESLV.ISV.SOIF.06.DB1519 605 0 8.3 3.7 0.0 8.5 15.5
RESLV.ISV.COPR.03.XD.DB1519 562 0 2.4 0.5 0.0 2.5 3.0
RESLV.ISV.MGDA.XD.DB1519 588 0 4.0 1.5 0.0 4.0 6.0
RESLV.ISV.ROPC.03.XD.DB1519 579 0 0.9 1.0 0.0 0.5 3.0
RESLV.ISV.CPI.04.XD.DB1519 582 0 1.5 0.9 0.0 1.0 4.0
dealing_w_construct 2035 0 63.0 15.5 0.0 66.2 91.6
getting_electricity 2057 0 68.5 18.0 0.0 71.5 100.0
registering_property 2064 0 63.5 16.0 0.0 63.7 99.9
getting_credit 636 0 53.8 21.5 0.0 55.0 100.0
protecting_minority 638 0 52.5 16.8 0.0 55.1 96.7
paying_taxes 1712 0 67.9 16.4 0.0 69.6 100.0
trading_borders 1498 0 68.4 20.6 0.0 69.8 100.0
enforcing_contracts 1235 0 56.1 13.1 3.6 57.0 89.2
overall_score_db 2128 0 61.8 13.0 20.0 61.8 89.5
overall_score 504 0 60.3 10.2 24.7 59.4 89.7
property_rights 549 0 47.9 22.7 0.2 45.0 100.0
government_integrity 544 0 41.8 20.2 5.0 36.3 99.5
judicial_effectiveness 1529 0 45.1 19.1 3.9 42.1 98.0
tax_burden 518 0 77.8 11.7 37.2 79.1 100.0
government_spending 716 0 64.9 22.4 0.0 69.9 97.0
fiscal_health 1548 0 64.2 25.5 0.0 69.8 100.0
business_freedom 616 0 63.7 15.5 10.0 63.9 99.9
labor_freedom 576 0 59.8 14.7 20.0 59.5 98.5
monetary_freedom 372 0 74.6 9.6 0.0 75.9 91.7
trade_freedom 392 0 74.1 10.8 0.0 75.0 95.0
investment_freedom 54 0 54.6 22.6 0.0 60.0 95.0
financial_freedom 67 0 48.1 18.6 0.0 50.0 90.0
cpi 121 0 42.6 18.6 8.0 38.0 92.0
educ_index 618 0 0.6 0.2 0.2 0.7 1.0

Random Forest Algorithm:

# Load the modeling and plotting packages
library(randomForest)
## randomForest 4.7-1.1
## Type rfNews() to see new features/changes/bug fixes.
## 
## Attaching package: 'randomForest'
## The following object is masked from 'package:dplyr':
## 
##     combine
## The following object is masked from 'package:ggplot2':
## 
##     margin
library(caret)
## Loading required package: lattice
## 
## Attaching package: 'caret'
## The following object is masked from 'package:purrr':
## 
##     lift
library(plotly)
## 
## Attaching package: 'plotly'
## The following object is masked from 'package:ggplot2':
## 
##     last_plot
## The following object is masked from 'package:stats':
## 
##     filter
## The following object is masked from 'package:graphics':
## 
##     layout
# Set seed for reproducibility
set.seed(9000)
## DV editing: the paper divided the CPI into 4 categories
# Create a new variable 'cpi_level' from fixed, equal-width ranges (not sample quartiles)
imputed_data <- imputed_data %>%
  mutate(cpi_level = cut(cpi, 
                         breaks = c(0, 25, 50, 75, 100), 
                         labels = c(0, 1, 2, 3),
                         right = TRUE, include.lowest = TRUE))

# Check the distribution of the new variable
table(imputed_data$cpi_level)
## 
##    0    1    2    3 
##  318 1214  435  161
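
To make the bin edges concrete, here is a minimal sketch on a handful of made-up CPI values (the vector `toy_cpi` is hypothetical and not taken from the data). With `right = TRUE` and `include.lowest = TRUE`, the intervals are [0, 25], (25, 50], (50, 75] and (75, 100], which is why the class counts above are far from equal.

```r
# Hypothetical CPI scores chosen to sit on and just past each bin edge
toy_cpi <- c(0, 25, 26, 50, 51, 75, 76, 100)

# Same breaks and labels as in the call above
toy_level <- cut(toy_cpi,
                 breaks = c(0, 25, 50, 75, 100),
                 labels = c(0, 1, 2, 3),
                 right = TRUE, include.lowest = TRUE)

# Pair each score with the level it falls into
data.frame(cpi = toy_cpi, level = toy_level)
```

A score exactly on an edge (25, 50, 75) goes to the lower class, and 0 is kept in the first class only because `include.lowest = TRUE`.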
# Define control for 10-fold cross-validation
control <- trainControl(method = "cv", number = 10)  # 10-fold cross-validation
# Train the Random Forest model using 10-fold cross-validation
rf_model <- train(
  cpi_level ~ .,          # Formula for the model (outcome ~ predictors)
  data = imputed_data[,-c(1:3,99,113,114)],         # Dataset
  method = "rf",                # Random Forest method
  trControl = control,          # Control for cross-validation
  importance = TRUE,            # To calculate variable importance
  ntree = 300                   # Number of trees (you can adjust)
)
# Print model summary
print(rf_model)
## Random Forest 
## 
## 2128 samples
##  109 predictor
##    4 classes: '0', '1', '2', '3' 
## 
## No pre-processing
## Resampling: Cross-Validated (10 fold) 
## Summary of sample sizes: 1915, 1916, 1916, 1915, 1917, 1915, ... 
## Resampling results across tuning parameters:
## 
##   mtry  Accuracy   Kappa    
##     2   0.9313614  0.8843758
##    55   0.9214845  0.8682596
##   109   0.9219672  0.8693415
## 
## Accuracy was used to select the optimal model using the largest value.
## The final value used for the model was mtry = 2.
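
The two metrics in the table can be reproduced by hand from a confusion matrix. The sketch below uses a small made-up matrix (`conf_mat` is hypothetical, not the model's actual result) and assumes the Kappa column is the unweighted Cohen's kappa.

```r
# Hypothetical 3-class confusion matrix (rows = predicted, columns = observed)
conf_mat <- matrix(c(40,  5, 0,
                      4, 30, 6,
                      1,  5, 9),
                   nrow = 3, byrow = TRUE)

n <- sum(conf_mat)

# Observed agreement: share of cases on the diagonal (this is the accuracy)
accuracy <- sum(diag(conf_mat)) / n

# Agreement expected by chance, from the row and column marginals
expected <- sum(rowSums(conf_mat) * colSums(conf_mat)) / n^2

# Cohen's kappa rescales accuracy so that 0 means chance-level agreement
cohen_kappa <- (accuracy - expected) / (1 - expected)

c(accuracy = accuracy, kappa = cohen_kappa)
```

Because class 1 holds more than half of the observations, Kappa is the more informative of the two columns here: it discounts the agreement a classifier would get by favoring the majority class.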
# Get variable importance
# If the importance is split across multiple classes, calculate the mean importance
var_imp <- varImp(rf_model, scale = FALSE)
var_imp_df <- as.data.frame(var_imp$importance)
var_imp_df$Overall <- rowMeans(var_imp_df)  # Averaging across all classes
# Calculate total importance (sum of overall importance)
total_importance <- sum(var_imp_df$Overall)
# Calculate heuristic importance for each variable
var_imp_df$heuristic_importance <- var_imp_df$Overall / total_importance
# Sort by heuristic importance and select top 20 variables
top_20_vars <- var_imp_df %>%
  arrange(desc(heuristic_importance)) %>%
  head(20)
# Add a column for variable names to top_20_vars
top_20_vars$variable <- rownames(top_20_vars)

# Create a bar plot for the top 20 important variables
importance_plot_rf <- plot_ly(
  top_20_vars, 
  x = ~reorder(variable, -heuristic_importance),  # order bars by importance, not alphabetically
  y = ~heuristic_importance, 
  type = 'bar',
  marker = list(color = 'blue')
) %>%
  layout(
    title = 'Top 20 Variable Importance (Heuristic Method)',
    xaxis = list(title = 'Variables'),
    yaxis = list(title = 'Heuristic Importance'),
    barmode = 'group'
  )

# the plot
importance_plot_rf
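
The normalization used in the code above (mean importance across classes, divided by the total over all predictors) can be sketched on toy numbers. The data frame `toy_imp` and its variable names are hypothetical.

```r
# Hypothetical per-class importance scores for three predictors
toy_imp <- data.frame(
  `0` = c(10, 6, 4),
  `1` = c( 8, 6, 2),
  check.names = FALSE,
  row.names = c("var_a", "var_b", "var_c")
)

# Mean importance across classes, as in the Overall column
toy_imp$Overall <- rowMeans(toy_imp)

# Share of the total mean importance carried by each predictor
toy_imp$heuristic_importance <- toy_imp$Overall / sum(toy_imp$Overall)

# Sort from most to least important
toy_imp[order(-toy_imp$heuristic_importance), ]
```

Since the shares sum to 1, a model with 109 predictors of equal importance would give each a share of 1/109, or about 0.0092. Shares only slightly above that value indicate that importance is spread fairly evenly across predictors.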

Displaying variable importance as a table to make it more readable:

# Display the top 20 predictors
print(top_20_vars)
##                                                       0         1         2
## government_integrity                          10.275240  9.120248  9.965555
## overall_score                                  8.752978  9.612558  9.619705
## judicial_effectiveness                         7.614339  9.485374 10.005600
## IC.CNST.PRMT.COST.WRH.VAL                      7.962846  8.964782 10.238625
## RESLV.ISV.RCOV.RT                              8.900820  8.859588  7.561705
## property_rights                                7.330619  8.855799  8.595582
## PAY.TAX.TM                                     6.147319  9.601270  9.587344
## educ_index                                     6.259592  9.153169  8.683438
## RESLV.ISV.DB1519.DFRN                          8.420564  8.633315  7.803198
## TRD.ACRS.BRDR.IMP.TM.DOC.COMP.HR.DB1619.DFRN   8.557799  9.207875  7.199237
## ENF.CONT.COEN.CTFE.PR                          7.435216  9.196429  9.860576
## PAY.TAX.POST.FIL.XD.0100.DB1719.DFRN           7.315982  8.765450  9.102325
## TRD.ACRS.BRDR.EXPT.TM.DOC.COMP.HR.DB1619.DFRN  8.770743 10.444583  7.240865
## IC.REG.COST.PC.FE.ZS                           7.518895 10.037614  7.040299
## financial_freedom                              8.073599  8.658639  7.315675
## IC.CNST.PRMT.TM.DY                             6.710858 10.549557  8.465423
## tax_burden                                     7.058841  9.921912  7.726866
## ENF.CONT.COEN.COST.ZS                          6.159220  9.384674  8.518290
## IC.ELC.ACS.COST                                7.411110  8.902299  7.638461
## PAY.TAX.LABR.TAX.CONTR.ZS                      7.279932  8.616527  9.380723
##                                                      3  Overall
## government_integrity                          8.667643 9.507172
## overall_score                                 7.311575 8.824204
## judicial_effectiveness                        7.632014 8.684332
## IC.CNST.PRMT.COST.WRH.VAL                     6.061821 8.307018
## RESLV.ISV.RCOV.RT                             7.614416 8.234132
## property_rights                               7.469381 8.062845
## PAY.TAX.TM                                    6.851884 8.046954
## educ_index                                    7.842426 7.984656
## RESLV.ISV.DB1519.DFRN                         6.789200 7.911569
## TRD.ACRS.BRDR.IMP.TM.DOC.COMP.HR.DB1619.DFRN  6.634510 7.899855
## ENF.CONT.COEN.CTFE.PR                         5.105533 7.899438
## PAY.TAX.POST.FIL.XD.0100.DB1719.DFRN          6.222263 7.851505
## TRD.ACRS.BRDR.EXPT.TM.DOC.COMP.HR.DB1619.DFRN 4.875701 7.832973
## IC.REG.COST.PC.FE.ZS                          6.729101 7.831477
## financial_freedom                             7.254660 7.825643
## IC.CNST.PRMT.TM.DY                            5.558626 7.821116
## tax_burden                                    6.395377 7.775749
## ENF.CONT.COEN.COST.ZS                         6.383151 7.611334
## IC.ELC.ACS.COST                               6.408847 7.590179
## PAY.TAX.LABR.TAX.CONTR.ZS                     4.937864 7.553761
##                                               heuristic_importance
## government_integrity                                    0.01265745
## overall_score                                           0.01174818
## judicial_effectiveness                                  0.01156196
## IC.CNST.PRMT.COST.WRH.VAL                               0.01105962
## RESLV.ISV.RCOV.RT                                       0.01096258
## property_rights                                         0.01073454
## PAY.TAX.TM                                              0.01071338
## educ_index                                              0.01063044
## RESLV.ISV.DB1519.DFRN                                   0.01053313
## TRD.ACRS.BRDR.IMP.TM.DOC.COMP.HR.DB1619.DFRN            0.01051754
## ENF.CONT.COEN.CTFE.PR                                   0.01051698
## PAY.TAX.POST.FIL.XD.0100.DB1719.DFRN                    0.01045317
## TRD.ACRS.BRDR.EXPT.TM.DOC.COMP.HR.DB1619.DFRN           0.01042849
## IC.REG.COST.PC.FE.ZS                                    0.01042650
## financial_freedom                                       0.01041874
## IC.CNST.PRMT.TM.DY                                      0.01041271
## tax_burden                                              0.01035231
## ENF.CONT.COEN.COST.ZS                                   0.01013341
## IC.ELC.ACS.COST                                         0.01010525
## PAY.TAX.LABR.TAX.CONTR.ZS                               0.01005676
##                                                                                    variable
## government_integrity                                                   government_integrity
## overall_score                                                                 overall_score
## judicial_effectiveness                                               judicial_effectiveness
## IC.CNST.PRMT.COST.WRH.VAL                                         IC.CNST.PRMT.COST.WRH.VAL
## RESLV.ISV.RCOV.RT                                                         RESLV.ISV.RCOV.RT
## property_rights                                                             property_rights
## PAY.TAX.TM                                                                       PAY.TAX.TM
## educ_index                                                                       educ_index
## RESLV.ISV.DB1519.DFRN                                                 RESLV.ISV.DB1519.DFRN
## TRD.ACRS.BRDR.IMP.TM.DOC.COMP.HR.DB1619.DFRN   TRD.ACRS.BRDR.IMP.TM.DOC.COMP.HR.DB1619.DFRN
## ENF.CONT.COEN.CTFE.PR                                                 ENF.CONT.COEN.CTFE.PR
## PAY.TAX.POST.FIL.XD.0100.DB1719.DFRN                   PAY.TAX.POST.FIL.XD.0100.DB1719.DFRN
## TRD.ACRS.BRDR.EXPT.TM.DOC.COMP.HR.DB1619.DFRN TRD.ACRS.BRDR.EXPT.TM.DOC.COMP.HR.DB1619.DFRN
## IC.REG.COST.PC.FE.ZS                                                   IC.REG.COST.PC.FE.ZS
## financial_freedom                                                         financial_freedom
## IC.CNST.PRMT.TM.DY                                                       IC.CNST.PRMT.TM.DY
## tax_burden                                                                       tax_burden
## ENF.CONT.COEN.COST.ZS                                                 ENF.CONT.COEN.COST.ZS
## IC.ELC.ACS.COST                                                             IC.ELC.ACS.COST
## PAY.TAX.LABR.TAX.CONTR.ZS                                         PAY.TAX.LABR.TAX.CONTR.ZS

The second approach, namely neural networks:

# Normalize predictors to ensure better performance for the neural network
nnet.data <- imputed_data[, -c(1:3, 99, 113, 114)]
preproc <- preProcess(nnet.data, method = c("center", "scale"))
nnet.data <- predict(preproc, nnet.data)
# Ensure that the target variable 'cpi_level' is a factor (ordinal)
nnet.data$cpi_level <- factor(nnet.data$cpi_level, ordered = TRUE)
# Set seed for reproducibility
set.seed(9000)
# Define control for 10-fold cross-validation (same settings as for the Random Forest)
control <- trainControl(method = "cv", number = 10)
library(nnet)
# Train a single-hidden-layer neural network (3, 5, or 7 hidden units) using 10-fold cross-validation.
# Note: nnet treats the four cpi_level classes as unordered, so the ordering set above is not used in fitting.
nn_model <- train(
  cpi_level ~ .,          # Formula for the model (outcome ~ predictors)
  data = nnet.data,         # Dataset
  method = "nnet",              # Neural Network method
  trControl = control,          # Control for cross-validation
  tuneGrid = expand.grid(size = c(3, 5, 7), decay = c(0.01, 0.001)), # Adjust size and decay
  trace = FALSE,                # Suppress trace output
  maxit = 200                   # Maximum number of iterations
)
# Check the model
print(nn_model)
## Neural Network 
## 
## 2128 samples
##  109 predictor
##    4 classes: '0', '1', '2', '3' 
## 
## No pre-processing
## Resampling: Cross-Validated (10 fold) 
## Summary of sample sizes: 1915, 1916, 1916, 1915, 1917, 1915, ... 
## Resampling results across tuning parameters:
## 
##   size  decay  Accuracy   Kappa    
##   3     0.001  0.8505579  0.7510291
##   3     0.010  0.8792283  0.8006477
##   5     0.001  0.8881748  0.8158830
##   5     0.010  0.8947014  0.8271003
##   7     0.001  0.8957331  0.8287865
##   7     0.010  0.8951821  0.8271879
## 
## Accuracy was used to select the optimal model using the largest value.
## The final values used for the model were size = 7 and decay = 0.001.
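
The six rows in the table above are the Cartesian product of the `size` and `decay` candidates. A minimal sketch of how the grid expands and how many fits it implies (the names `grid`, `n_folds` and `n_fits` are illustrative only):

```r
# Same candidate values as in the tuneGrid argument above
grid <- expand.grid(size = c(3, 5, 7), decay = c(0.01, 0.001))

# One row per (size, decay) pair
nrow(grid)

# With 10-fold cross-validation each pair is fitted once per fold,
# plus one final fit on the full data with the selected pair
n_folds <- 10
n_fits <- nrow(grid) * n_folds + 1
n_fits
```

The accuracies for the three best settings differ only in the third decimal place, so the choice of size = 7 and decay = 0.001 over its neighbors is a narrow one.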
library(NeuralNetTools)
var.imp.nn <- garson(nn_model, bar_plot = FALSE)
# Sorting var.imp.nn in decreasing order based on rel_imp
var.imp.nn_sorted <- var.imp.nn[order(-var.imp.nn$rel_imp), , drop = FALSE]
# Add the variable names back as a column (if they were row names)
var.imp.nn_sorted$variable <- rownames(var.imp.nn_sorted)
# Creating the bar plot for variable importance
importance_plot_nn <- plot_ly(var.imp.nn_sorted, 
                              x = ~variable, 
                              y = ~rel_imp, 
                              type = 'bar',
                              marker = list(color = 'green')) %>%
  layout(title = 'Variable Importance from Neural Network',
         xaxis = list(title = 'Variables'),
         yaxis = list(title = 'Relative Importance'),
         barmode = 'group')

# the plot
importance_plot_nn

Displaying the top 20 predictors:

# Print the top 20 predictors
var.imp.nn_sorted %>%
  slice_max(order_by = rel_imp, n = 20) %>%
  print()
##                                                rel_imp
## government_integrity                        0.03311002
## paying_taxes                                0.01682678
## RESLV.ISV.RCOV.RT                           0.01667900
## ENF.CONT.COEN.CTSP.DB1719                   0.01607701
## IC.CRED.ACC.PUBL.CRD.REG.COVR.ZS            0.01546024
## PAY.TAX.PRFT.CP.ZS                          0.01519109
## PROT.MINOR.INV.EXT.BUS.DISC.010.XD          0.01433637
## PROT.MINOR.INV.EXT.OWNR.CONT.XD.0100.DB1519 0.01410425
## IC.CNST.PRMT.PROC.NO                        0.01407838
## ENF.CONT.COEN.CTAU                          0.01383389
## business_freedom                            0.01369460
## RESLV.ISV.COPR.03.XD.DB1519                 0.01359411
## TRD.ACRS.BRDR.IMP.COST.BRDR.COMP.CD.DB1619  0.01306166
## getting_credit                              0.01295903
## PAY.TAX.LABR.TAX.CONTR.ZS                   0.01288461
## ENF.CONT.COEN.ATDR                          0.01255668
## IC.ELC.COMM.TRFF.CG.01.DB1619               0.01217087
## PAY.TAX.POST.FIL.XD.0100.DB1719.DFRN        0.01210283
## RESLV.ISV.CPI.04.XD.DB1519                  0.01191477
## RESLV.ISV.ROPC.03.XD.DB1519                 0.01190975
##                                                                                variable
## government_integrity                                               government_integrity
## paying_taxes                                                               paying_taxes
## RESLV.ISV.RCOV.RT                                                     RESLV.ISV.RCOV.RT
## ENF.CONT.COEN.CTSP.DB1719                                     ENF.CONT.COEN.CTSP.DB1719
## IC.CRED.ACC.PUBL.CRD.REG.COVR.ZS                       IC.CRED.ACC.PUBL.CRD.REG.COVR.ZS
## PAY.TAX.PRFT.CP.ZS                                                   PAY.TAX.PRFT.CP.ZS
## PROT.MINOR.INV.EXT.BUS.DISC.010.XD                   PROT.MINOR.INV.EXT.BUS.DISC.010.XD
## PROT.MINOR.INV.EXT.OWNR.CONT.XD.0100.DB1519 PROT.MINOR.INV.EXT.OWNR.CONT.XD.0100.DB1519
## IC.CNST.PRMT.PROC.NO                                               IC.CNST.PRMT.PROC.NO
## ENF.CONT.COEN.CTAU                                                   ENF.CONT.COEN.CTAU
## business_freedom                                                       business_freedom
## RESLV.ISV.COPR.03.XD.DB1519                                 RESLV.ISV.COPR.03.XD.DB1519
## TRD.ACRS.BRDR.IMP.COST.BRDR.COMP.CD.DB1619   TRD.ACRS.BRDR.IMP.COST.BRDR.COMP.CD.DB1619
## getting_credit                                                           getting_credit
## PAY.TAX.LABR.TAX.CONTR.ZS                                     PAY.TAX.LABR.TAX.CONTR.ZS
## ENF.CONT.COEN.ATDR                                                   ENF.CONT.COEN.ATDR
## IC.ELC.COMM.TRFF.CG.01.DB1619                             IC.ELC.COMM.TRFF.CG.01.DB1619
## PAY.TAX.POST.FIL.XD.0100.DB1719.DFRN               PAY.TAX.POST.FIL.XD.0100.DB1719.DFRN
## RESLV.ISV.CPI.04.XD.DB1519                                   RESLV.ISV.CPI.04.XD.DB1519
## RESLV.ISV.ROPC.03.XD.DB1519                                 RESLV.ISV.ROPC.03.XD.DB1519

The last machine learning algorithm, the support vector machine:

library(e1071)
# Set seed for reproducibility
set.seed(9000)
# Define the control function for k-fold cross-validation (k = 10)
train_control <- trainControl(method = "cv", number = 10)
# Train the SVM model
svm_model <- train(
  cpi_level ~ .,  # Formula for the model (outcome ~ predictors)
  data = nnet.data,
  method = "svmRadial",  # Radial basis function kernel for non-linear SVM
  trControl = train_control,  # Control for cross-validation
  preProcess = c("center", "scale"),  # Preprocess by centering and scaling
  tuneLength = 10  # Grid search over 10 values of the tuning parameter
)
# Print the summary of the model
print(svm_model)
## Support Vector Machines with Radial Basis Function Kernel 
## 
## 2128 samples
##  109 predictor
##    4 classes: '0', '1', '2', '3' 
## 
## Pre-processing: centered (109), scaled (109) 
## Resampling: Cross-Validated (10 fold) 
## Summary of sample sizes: 1915, 1916, 1916, 1915, 1917, 1915, ... 
## Resampling results across tuning parameters:
## 
##   C       Accuracy   Kappa    
##     0.25  0.8932974  0.8166285
##     0.50  0.9031588  0.8345985
##     1.00  0.9139726  0.8537235
##     2.00  0.9261904  0.8746506
##     4.00  0.9276078  0.8777353
##     8.00  0.9271471  0.8775576
##    16.00  0.9332748  0.8886084
##    32.00  0.9346810  0.8914972
##    64.00  0.9351461  0.8924387
##   128.00  0.9323314  0.8882008
## 
## Tuning parameter 'sigma' was held constant at a value of 0.007229501
## Accuracy was used to select the optimal model using the largest value.
## The final values used for the model were sigma = 0.007229501 and C = 64.
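
The ten cost values in the table are consecutive powers of two. A minimal sketch, assuming caret builds its default `svmRadial` cost grid as 2^(k - 3) for k = 1, ..., `tuneLength` (an assumption that matches the values printed above), while `sigma` is held constant as the output notes:

```r
tune_length <- 10

# Assumed default grid for the cost parameter: powers of two starting at 2^-2
cost_grid <- 2^(seq_len(tune_length) - 3)
cost_grid
```

Accuracy peaks at C = 64 and drops slightly at C = 128, so the optimum lies inside the searched range rather than at its upper edge.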
library(vip)
## 
## Attaching package: 'vip'
## The following object is masked from 'package:utils':
## 
##     vi
# Prediction wrapper function for classification outcome
predict_classification <- function(object, newdata) {
  as.factor(predict(object, newdata = newdata))  # Ensure predictions are factors for classification
}
# Get variable importance using caret's varImp function; svmRadial has no model-specific
# importance measure, so caret falls back to a model-free, ROC-based filter for each predictor
svm_var_imp <- varImp(svm_model, scale = FALSE)

# Convert the variable importance object to a data frame
svm_var_imp_df <- as.data.frame(svm_var_imp$importance)

# Add the variable names to the data frame
svm_var_imp_df$variable <- rownames(svm_var_imp_df)

# If it's a multiclass model, calculate the mean importance across all classes
if (ncol(svm_var_imp_df) > 1) {
  svm_var_imp_df$Overall <- rowMeans(svm_var_imp_df[, -ncol(svm_var_imp_df)])
} else {
  svm_var_imp_df$Overall <- svm_var_imp_df[, 1]  # Single-class case
}

# Select the top 20 important variables
svm_var_imp_top20 <- svm_var_imp_df %>%
  top_n(20, Overall) %>%
  arrange(desc(Overall))

# Plot the top 20 important variables using plotly
library(plotly)

importance_plot_svm <- plot_ly(svm_var_imp_top20, 
                               x = ~reorder(variable, Overall), 
                               y = ~Overall, 
                               type = 'bar', 
                               marker = list(color = 'green')) %>%
  layout(title = 'Top 20 Important Variables from SVM Model',
         xaxis = list(title = 'Variables'),
         yaxis = list(title = 'Mean Importance'),
         barmode = 'group')

# Show the plot
importance_plot_svm
print(svm_var_imp_top20)
##                                                      X0        X1        X2
## government_integrity                          0.9999169 1.0000000 0.9775408
## judicial_effectiveness                        0.9949035 1.0000000 0.9391250
## property_rights                               0.9900022 1.0000000 0.9282840
## overall_score                                 0.9930456 1.0000000 0.9109129
## overall_score_db                              0.9878624 0.9994336 0.8847513
## paying_taxes                                  0.9776477 0.9965819 0.8581255
## TRD.ACRS.BRDR.EXPT.TM.DOC.COMP.HR.DB1619.DFRN 0.9733969 0.9998047 0.8487976
## trading_borders                               0.9677583 0.9988867 0.8509851
## business_freedom                              0.9712174 0.9957518 0.8454383
## TRD.ACRS.BRDR.IMP.TM.DOC.COMP.HR.DB1619.DFRN  0.9744524 0.9995508 0.8337755
## IC.ELC.ACS.COST                               0.9660160 0.9961034 0.8356530
## IC.REG.PRRT.QUAL.LNDADM.XD.030.DB16           0.9568857 0.9990625 0.8126711
## educ_index                                    0.9370021 0.9999805 0.8270191
## ENF.CONT.COEN.QUJP.XD                         0.9608979 0.9905856 0.8168291
## RESLV.ISV.RCOV.RT                             0.9254066 0.9998437 0.8335276
## TRD.ACRS.BRDR.IMP.TM.BRDR.COMP.HR.DB1619.DFRN 0.9609232 0.9883492 0.8166269
## RESLV.ISV.DB1519.DFRN                         0.9172558 0.9969921 0.8427828
## IC.REG.PRRT.GEO.COVR.XD.08.DB1619             0.9202523 0.9963475 0.8260552
## investment_freedom                            0.9392214 0.9966503 0.8028669
## IC.REG.COST.PC.FE.ZS                          0.9489084 0.9904293 0.8033231
##                                                      X3
## government_integrity                          1.0000000
## judicial_effectiveness                        1.0000000
## property_rights                               1.0000000
## overall_score                                 1.0000000
## overall_score_db                              0.9994336
## paying_taxes                                  0.9965819
## TRD.ACRS.BRDR.EXPT.TM.DOC.COMP.HR.DB1619.DFRN 0.9998047
## trading_borders                               0.9988867
## business_freedom                              0.9957518
## TRD.ACRS.BRDR.IMP.TM.DOC.COMP.HR.DB1619.DFRN  0.9995508
## IC.ELC.ACS.COST                               0.9961034
## IC.REG.PRRT.QUAL.LNDADM.XD.030.DB16           0.9990625
## educ_index                                    0.9999805
## ENF.CONT.COEN.QUJP.XD                         0.9905856
## RESLV.ISV.RCOV.RT                             0.9998437
## TRD.ACRS.BRDR.IMP.TM.BRDR.COMP.HR.DB1619.DFRN 0.9883492
## RESLV.ISV.DB1519.DFRN                         0.9969921
## IC.REG.PRRT.GEO.COVR.XD.08.DB1619             0.9963475
## investment_freedom                            0.9966503
## IC.REG.COST.PC.FE.ZS                          0.9904293
##                                                                                    variable
## government_integrity                                                   government_integrity
## judicial_effectiveness                                               judicial_effectiveness
## property_rights                                                             property_rights
## overall_score                                                                 overall_score
## overall_score_db                                                           overall_score_db
## paying_taxes                                                                   paying_taxes
## TRD.ACRS.BRDR.EXPT.TM.DOC.COMP.HR.DB1619.DFRN TRD.ACRS.BRDR.EXPT.TM.DOC.COMP.HR.DB1619.DFRN
## trading_borders                                                             trading_borders
## business_freedom                                                           business_freedom
## TRD.ACRS.BRDR.IMP.TM.DOC.COMP.HR.DB1619.DFRN   TRD.ACRS.BRDR.IMP.TM.DOC.COMP.HR.DB1619.DFRN
## IC.ELC.ACS.COST                                                             IC.ELC.ACS.COST
## IC.REG.PRRT.QUAL.LNDADM.XD.030.DB16                     IC.REG.PRRT.QUAL.LNDADM.XD.030.DB16
## educ_index                                                                       educ_index
## ENF.CONT.COEN.QUJP.XD                                                 ENF.CONT.COEN.QUJP.XD
## RESLV.ISV.RCOV.RT                                                         RESLV.ISV.RCOV.RT
## TRD.ACRS.BRDR.IMP.TM.BRDR.COMP.HR.DB1619.DFRN TRD.ACRS.BRDR.IMP.TM.BRDR.COMP.HR.DB1619.DFRN
## RESLV.ISV.DB1519.DFRN                                                 RESLV.ISV.DB1519.DFRN
## IC.REG.PRRT.GEO.COVR.XD.08.DB1619                         IC.REG.PRRT.GEO.COVR.XD.08.DB1619
## investment_freedom                                                       investment_freedom
## IC.REG.COST.PC.FE.ZS                                                   IC.REG.COST.PC.FE.ZS
##                                                 Overall
## government_integrity                          0.9943644
## judicial_effectiveness                        0.9835071
## property_rights                               0.9795715
## overall_score                                 0.9759896
## overall_score_db                              0.9678702
## paying_taxes                                  0.9572342
## TRD.ACRS.BRDR.EXPT.TM.DOC.COMP.HR.DB1619.DFRN 0.9554510
## trading_borders                               0.9541292
## business_freedom                              0.9520398
## TRD.ACRS.BRDR.IMP.TM.DOC.COMP.HR.DB1619.DFRN  0.9518324
## IC.ELC.ACS.COST                               0.9484689
## IC.REG.PRRT.QUAL.LNDADM.XD.030.DB16           0.9419204
## educ_index                                    0.9409955
## ENF.CONT.COEN.QUJP.XD                         0.9397245
## RESLV.ISV.RCOV.RT                             0.9396554
## TRD.ACRS.BRDR.IMP.TM.BRDR.COMP.HR.DB1619.DFRN 0.9385621
## RESLV.ISV.DB1519.DFRN                         0.9385057
## IC.REG.PRRT.GEO.COVR.XD.08.DB1619             0.9347506
## investment_freedom                            0.9338472
## IC.REG.COST.PC.FE.ZS                          0.9332725

Variable names: overall_score is the Economic Freedom Index, and overall_score_db is the Ease of Doing Business score. “TRD.ACRS.BRDR.EXPT.TM.DOC.COMP.HR.DB1619.DFRN” is the score for Time to export: Documentary compliance (hours) (DB16-20 methodology). “TRD.ACRS.BRDR.IMP.TM.DOC.COMP.HR.DB1619.DFRN” is the corresponding Trading across borders indicator, Time to import: Documentary compliance (hours) (DB16-19 methodology), that is, the number of hours required for documentary compliance when importing across borders. The remaining codes are defined in the World Bank glossary.

Confusion Matrix for the Random Forest model:

rf_predictions <- predict(rf_model, newdata = nnet.data)
# Generate the confusion matrix using the actual and predicted values
conf_matrix <- confusionMatrix(rf_predictions, nnet.data$cpi_level)
# Print the confusion matrix
print(conf_matrix)
## Confusion Matrix and Statistics
## 
##           Reference
## Prediction    0    1    2    3
##          0  318 1214  435  161
##          1    0    0    0    0
##          2    0    0    0    0
##          3    0    0    0    0
## 
## Overall Statistics
##                                           
##                Accuracy : 0.1494          
##                  95% CI : (0.1345, 0.1653)
##     No Information Rate : 0.5705          
##     P-Value [Acc > NIR] : 1               
##                                           
##                   Kappa : 0               
##                                           
##  Mcnemar's Test P-Value : NA              
## 
## Statistics by Class:
## 
##                      Class: 0 Class: 1 Class: 2 Class: 3
## Sensitivity            1.0000   0.0000   0.0000  0.00000
## Specificity            0.0000   1.0000   1.0000  1.00000
## Pos Pred Value         0.1494      NaN      NaN      NaN
## Neg Pred Value            NaN   0.4295   0.7956  0.92434
## Prevalence             0.1494   0.5705   0.2044  0.07566
## Detection Rate         0.1494   0.0000   0.0000  0.00000
## Detection Prevalence   1.0000   0.0000   0.0000  0.00000
## Balanced Accuracy      0.5000   0.5000   0.5000  0.50000
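
The headline figures in this output can be recomputed by hand from the counts, which also makes it clear that every observation was assigned to class 0. The following is a minimal base-R sketch; the object names (cm_rf, p_observed, p_expected, no_info_rate, cohen_kappa) are illustrative and are not part of the original analysis.

```r
# Counts copied from the Random Forest confusion matrix above
# (rows = predicted class, columns = reference class).
cm_rf <- matrix(
  c(318, 1214, 435, 161,
      0,    0,   0,   0,
      0,    0,   0,   0,
      0,    0,   0,   0),
  nrow = 4, byrow = TRUE,
  dimnames = list(Prediction = as.character(0:3),
                  Reference  = as.character(0:3))
)

n <- sum(cm_rf)

# Accuracy: share of observations on the diagonal.
p_observed <- sum(diag(cm_rf)) / n

# No Information Rate: share of the largest reference class.
no_info_rate <- max(colSums(cm_rf)) / n

# Cohen's Kappa: agreement beyond what the row and column margins imply.
p_expected  <- sum(rowSums(cm_rf) * colSums(cm_rf)) / n^2
cohen_kappa <- (p_observed - p_expected) / (1 - p_expected)

round(c(accuracy = p_observed, nir = no_info_rate, kappa = cohen_kappa), 4)
```

Because all predictions fall in a single row, the observed agreement equals the chance agreement and Kappa is 0. An accuracy below the No Information Rate means these predictions carry no information about the reference classes, so the inputs to predict() (for example the factor levels of the outcome and the columns passed via newdata) are worth inspecting before this model is compared with the other two.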

Confusion Matrix for the Neural Net model:

nnet_predictions <- predict(nn_model, newdata = nnet.data)
# Generate the confusion matrix using the actual and predicted values
conf_matrix <- confusionMatrix(nnet_predictions, nnet.data$cpi_level)
# Print the confusion matrix
print(conf_matrix)
## Confusion Matrix and Statistics
## 
##           Reference
## Prediction    0    1    2    3
##          0  307    2    0    0
##          1   11 1211    4    1
##          2    0    1  429    2
##          3    0    0    2  158
## 
## Overall Statistics
##                                           
##                Accuracy : 0.9892          
##                  95% CI : (0.9838, 0.9931)
##     No Information Rate : 0.5705          
##     P-Value [Acc > NIR] : < 2.2e-16       
##                                           
##                   Kappa : 0.9821          
##                                           
##  Mcnemar's Test P-Value : NA              
## 
## Statistics by Class:
## 
##                      Class: 0 Class: 1 Class: 2 Class: 3
## Sensitivity            0.9654   0.9975   0.9862  0.98137
## Specificity            0.9989   0.9825   0.9982  0.99898
## Pos Pred Value         0.9935   0.9870   0.9931  0.98750
## Neg Pred Value         0.9940   0.9967   0.9965  0.99848
## Prevalence             0.1494   0.5705   0.2044  0.07566
## Detection Rate         0.1443   0.5691   0.2016  0.07425
## Detection Prevalence   0.1452   0.5766   0.2030  0.07519
## Balanced Accuracy      0.9822   0.9900   0.9922  0.99017
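
The per-class statistics follow from a one-versus-rest reading of the same table. As a hedged illustration, the base-R sketch below recomputes them from the Neural Net counts above; the object names (cm_nn, tp, fp, fn, tn, per_class) are assumptions introduced here.

```r
# Counts copied from the Neural Net confusion matrix above
# (rows = predicted class, columns = reference class).
cm_nn <- matrix(
  c(307,    2,   0,   0,
     11, 1211,   4,   1,
      0,    1, 429,   2,
      0,    0,   2, 158),
  nrow = 4, byrow = TRUE,
  dimnames = list(Prediction = as.character(0:3),
                  Reference  = as.character(0:3))
)

# One-versus-rest counts for each class.
tp <- diag(cm_nn)            # predicted k, truly k
fp <- rowSums(cm_nn) - tp    # predicted k, truly another class
fn <- colSums(cm_nn) - tp    # truly k, predicted another class
tn <- sum(cm_nn) - tp - fp - fn

per_class <- rbind(
  Sensitivity    = tp / (tp + fn),
  Specificity    = tn / (tn + fp),
  Pos_Pred_Value = tp / (tp + fp),
  Neg_Pred_Value = tn / (tn + fn)
)
per_class <- rbind(
  per_class,
  Balanced_Accuracy = (per_class["Sensitivity", ] + per_class["Specificity", ]) / 2
)

round(per_class, 4)
```

Swapping in the counts from the SVM table gives the corresponding SVM statistics, which makes the two models easy to compare class by class.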

Confusion Matrix for the SVM model:

# Predict class labels on the training data (in-sample); substitute held-out test data if available
svm_predictions <- predict(svm_model, newdata = nnet.data)
# Generate the confusion matrix using the actual and predicted values
conf_matrix <- confusionMatrix(svm_predictions, nnet.data$cpi_level)
# Print the confusion matrix
print(conf_matrix)
## Confusion Matrix and Statistics
## 
##           Reference
## Prediction    0    1    2    3
##          0  310    2    0    0
##          1    8 1208    5    0
##          2    0    4  427    7
##          3    0    0    3  154
## 
## Overall Statistics
##                                           
##                Accuracy : 0.9864          
##                  95% CI : (0.9805, 0.9909)
##     No Information Rate : 0.5705          
##     P-Value [Acc > NIR] : < 2.2e-16       
##                                           
##                   Kappa : 0.9774          
##                                           
##  Mcnemar's Test P-Value : NA              
## 
## Statistics by Class:
## 
##                      Class: 0 Class: 1 Class: 2 Class: 3
## Sensitivity            0.9748   0.9951   0.9816  0.95652
## Specificity            0.9989   0.9858   0.9935  0.99847
## Pos Pred Value         0.9936   0.9894   0.9749  0.98089
## Neg Pred Value         0.9956   0.9934   0.9953  0.99645
## Prevalence             0.1494   0.5705   0.2044  0.07566
## Detection Rate         0.1457   0.5677   0.2007  0.07237
## Detection Prevalence   0.1466   0.5738   0.2058  0.07378
## Balanced Accuracy      0.9869   0.9904   0.9876  0.97750
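
The comment in the SVM block indicates that predictions are made on the training data, so the accuracies above are in-sample figures. One way to obtain out-of-sample estimates is to hold out part of the data before fitting. The sketch below shows a stratified 80/20 split in base R on simulated data; toy_data, its predictors x1 and x2, the class proportions, and the 80% share are all assumptions made for illustration and are not taken from the original analysis.

```r
set.seed(42)

# Simulated stand-in for the modelling data (hypothetical, not the real dataset).
toy_data <- data.frame(
  x1 = rnorm(200),
  x2 = rnorm(200),
  cpi_level = factor(sample(0:3, 200, replace = TRUE,
                            prob = c(0.15, 0.57, 0.20, 0.08)))
)

# Stratified split: draw 80% of the row indices within each class,
# so the class proportions are roughly preserved in both subsets.
rows_by_class <- split(seq_len(nrow(toy_data)), toy_data$cpi_level)
train_idx <- unlist(
  lapply(rows_by_class, function(idx) {
    idx[sample.int(length(idx), floor(0.8 * length(idx)))]
  }),
  use.names = FALSE
)

train_set <- toy_data[train_idx, ]
test_set  <- toy_data[-train_idx, ]

# Class shares in each subset, for a quick sanity check.
round(prop.table(table(train_set$cpi_level)), 3)
round(prop.table(table(test_set$cpi_level)), 3)
```

With such a split, each model would be refit on train_set and the predict() and confusionMatrix() calls shown above would be pointed at test_set instead of the full data.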
Lima, Marcio Salles Melo, and Dursun Delen. 2020. “Predicting and Explaining Corruption Across Countries: A Machine Learning Approach.” Government Information Quarterly 37 (1): 101407.
Moody, James W, Lisa A Keister, and Maria C Ramos. 2022. “Reproducibility in the Social Sciences.” Annual Review of Sociology 48 (1): 65–85.